AITopics

Neural Information Processing Systems, Feb-11-2026, 23:21:23 GMT

cd556f38dba3a6c367c42fa85fc0801c-Paper-Datasets_and_Benchmarks.pdf

computational linguistic, erroneous span, tgea 2, (15 more...)

Country:

Asia > China > Tianjin Province > Tianjin (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Maryland > Baltimore (0.04)
(9 more...)

Genre: Research Report (0.46)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.46)

Neural Information Processing Systems, Feb-8-2026, 17:34:41 GMT

Transcormer: Transformer for Sentence Scoring with Sliding Language Modeling

Sentence scoring measures the likelihood of a sentence and is widely used in natural language processing scenarios such as reranking, which selects the best sentence from multiple candidates.
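To make the reranking use case concrete, here is a minimal sketch of likelihood-based sentence scoring. It uses an add-one-smoothed bigram model as a stand-in scorer; it is not the sliding language modeling approach named in the title, and the toy corpus, the function names (`train_bigram`, `score`, `rerank`), and the candidate sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def train_bigram(corpus):
    """Count bigrams and their left contexts over whitespace-tokenized sentences."""
    bigrams, contexts, vocab = Counter(), Counter(), set()
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        for prev, cur in zip(tokens, tokens[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, len(vocab)


def score(sentence, model):
    """Log-likelihood of a sentence under the add-one-smoothed bigram model."""
    bigrams, contexts, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + vocab_size))
        for prev, cur in zip(tokens, tokens[1:])
    )


def rerank(candidates, model):
    """Return the candidate with the highest sentence score."""
    return max(candidates, key=lambda s: score(s, model))


# Hypothetical toy corpus and candidates, for illustration only.
model = train_bigram(["the cat sat on the mat", "the dog sat on the rug"])
candidates = ["mat the on sat cat the", "the cat sat on the mat"]
best = rerank(candidates, model)
```

Any scorer that returns a per-sentence log-likelihood could replace `score` here without changing `rerank`.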

artificial intelligence, machine learning, natural language, (17 more...)

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(15 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing Systems, Aug-22-2025, 01:36:34 GMT

cd556f38dba3a6c367c42fa85fc0801c-Paper-Datasets_and_Benchmarks.pdf

artificial intelligence, machine learning, natural language, (19 more...)

Country:

Asia > China > Tianjin Province > Tianjin (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Maryland > Baltimore (0.04)
(9 more...)

Genre: Research Report (0.46)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.46)

Neural Information Processing Systems, Aug-14-2025, 15:12:16 GMT

486ff0b164cf92b0255fe39863bcf99e-Paper-Conference.pdf

bidirectional context, computational linguistic, probability, (12 more...)

Country:

Asia > China > Hong Kong (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(15 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

arXiv.org Artificial Intelligence, Aug-14-2025

Global Convergence Analysis of Vanilla Gradient Descent for Asymmetric Matrix Completion

Zhang, Xu, Chen, Shuo, Li, Jinsheng, Pang, Xiangying, Gong, Maoguo

This paper investigates the asymmetric low-rank matrix completion problem, which can be formulated as an unconstrained non-convex optimization problem with a nonlinear least-squares objective function, and is solved via gradient descent methods. Previous gradient descent approaches typically incorporate regularization terms into the objective function to guarantee convergence. However, numerical experiments and theoretical analysis of the gradient flow both demonstrate that the elimination of regularization terms in gradient descent algorithms does not adversely affect convergence performance. By introducing the leave-one-out technique, we inductively prove that the vanilla gradient descent with spectral initialization achieves a linear convergence rate with high probability. Besides, we demonstrate that the balancing regularization term exhibits a small norm during iterations, which reveals the implicit regularization property of gradient descent. Empirical results show that our algorithm has a lower computational cost while maintaining comparable completion performance compared to other gradient descent algorithms.
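The setup in this abstract can be sketched in a few lines of NumPy: unregularized gradient descent on the factors X and Y, started from a spectral initialization. This is a sketch under stated assumptions rather than the authors' algorithm: the objective scaling (1/(2p)), the step-size rule `eta_scale / sigma1`, the problem sizes, and all function names are illustrative choices not given in the abstract.

```python
import numpy as np


def spectral_init(M_obs, rank, p):
    """Top-`rank` SVD of the rescaled observed matrix, split evenly between X and Y."""
    U, s, Vt = np.linalg.svd(M_obs / p, full_matrices=False)
    root = np.sqrt(s[:rank])
    return U[:, :rank] * root, Vt[:rank].T * root, s[0]


def vanilla_gd(M_obs, mask, rank, p, steps=1000, eta_scale=0.2):
    """Gradient descent on (1/(2p)) * ||mask * (X Y^T - M)||_F^2 with no regularizer."""
    X, Y, sigma1 = spectral_init(M_obs, rank, p)
    eta = eta_scale / sigma1  # assumed step-size rule, not taken from the paper
    for _ in range(steps):
        residual = mask * (X @ Y.T - M_obs) / p
        # Simultaneous update: both gradients use the previous X and Y.
        X, Y = X - eta * (residual @ Y), Y - eta * (residual.T @ X)
    return X, Y


def relative_error(X, Y, M):
    return np.linalg.norm(X @ Y.T - M) / np.linalg.norm(M)


def demo(seed=0, n1=60, n2=50, rank=2, p=0.5):
    """Synthetic low-rank matrix with entries observed independently with probability p."""
    rng = np.random.default_rng(seed)
    M = rng.standard_normal((n1, rank)) @ rng.standard_normal((rank, n2))
    mask = (rng.random((n1, n2)) < p).astype(float)
    M_obs = mask * M
    X0, Y0, _ = spectral_init(M_obs, rank, p)
    X, Y = vanilla_gd(M_obs, mask, rank, p)
    # Norm of the balancing term X^T X - Y^T Y, which the abstract says stays small.
    balance = np.linalg.norm(X.T @ X - Y.T @ Y)
    return relative_error(X0, Y0, M), relative_error(X, Y, M), balance
```

Calling `demo()` returns the relative error at initialization, the relative error after gradient descent, and the norm of the balancing term, so the implicit-regularization claim can be inspected on a small synthetic instance.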

artificial intelligence, hypothesis 1, machine learning, (15 more...)

2508.09685

Country:

Asia > China > Shaanxi Province > Xi'an (0.04)
Asia > China > Hong Kong (0.04)
North America > United States (0.04)
(4 more...)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)

arXiv.org Artificial Intelligence, Jun-26-2025

Distilling A Universal Expert from Clustered Federated Learning

Leng, Zeqi, Zhang, Chunxu, Long, Guodong, Xia, Riting, Yang, Bo

Clustered Federated Learning (CFL) addresses the challenges posed by non-IID data by training multiple group- or cluster-specific expert models. However, existing methods often overlook the shared information across clusters, which represents the generalizable knowledge valuable to all participants in the Federated Learning (FL) system. To overcome this limitation, this paper introduces a novel FL framework that distills a universal expert model from the knowledge of multiple clusters. This universal expert captures globally shared information across all clients and is subsequently distributed to each client as the initialization for the next round of model training. The proposed FL framework operates in three iterative steps: (1) local model training at each client, (2) cluster-specific model aggregation, and (3) universal expert distillation. This three-step learning paradigm ensures the preservation of fine-grained non-IID characteristics while effectively incorporating shared knowledge across clusters. Compared to traditional gradient-based aggregation methods, the distillation-based model aggregation introduces greater flexibility in handling model heterogeneity and reduces conflicts among cluster-specific experts. Extensive experimental results demonstrate the superior performance of the proposed method across various scenarios, highlighting its potential to advance the state of CFL by balancing personalized and shared knowledge more effectively.
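The three iterative steps listed in the abstract can be sketched as one training round on toy linear models. Everything concrete here is an assumption for illustration: the linear regression clients, the fixed cluster assignment `cluster_of`, the proxy inputs `X_proxy` used for distillation, and the choice to distill by least-squares fitting to the experts' averaged predictions. The abstract does not specify the paper's actual distillation procedure.

```python
import numpy as np


def local_train(w, X, y, lr=0.1, epochs=20):
    """Step 1: a few gradient steps on a client's squared-error loss."""
    for _ in range(epochs):
        grad = X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w


def aggregate_clusters(client_weights, cluster_of):
    """Step 2: average client models within each (assumed, fixed) cluster."""
    clusters = {}
    for w, c in zip(client_weights, cluster_of):
        clusters.setdefault(c, []).append(w)
    return {c: np.mean(ws, axis=0) for c, ws in clusters.items()}


def distill_universal(experts, X_proxy):
    """Step 3: fit one model to the experts' averaged predictions on proxy inputs."""
    soft_targets = np.mean([X_proxy @ w for w in experts.values()], axis=0)
    w_universal, *_ = np.linalg.lstsq(X_proxy, soft_targets, rcond=None)
    return w_universal


def run_round(w_init, client_data, cluster_of, X_proxy):
    """One round; the returned universal model would initialize the next round."""
    local = [local_train(w_init.copy(), X, y) for X, y in client_data]
    experts = aggregate_clusters(local, cluster_of)
    return experts, distill_universal(experts, X_proxy)


# Hypothetical data: four clients in two clusters with different ground-truth weights.
rng = np.random.default_rng(0)
true_w = {0: np.array([1.0, -2.0, 0.5]), 1: np.array([-1.0, 0.5, 2.0])}
cluster_of = [0, 0, 1, 1]
client_data = []
for c in cluster_of:
    X = rng.standard_normal((30, 3))
    client_data.append((X, X @ true_w[c]))
X_proxy = rng.standard_normal((20, 3))
experts, universal = run_round(np.zeros(3), client_data, cluster_of, X_proxy)
```

With linear models and full-rank proxy inputs, this distillation step reduces to averaging the expert weights, so the sketch shows only the control flow. The flexibility the abstract attributes to distillation would matter for nonlinear or heterogeneous models.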

artificial intelligence, federated learning, machine learning, (18 more...)

2506.20285

Country:

North America > United States > Virginia (0.04)
Asia > Mongolia (0.04)
Asia > China > Jilin Province (0.04)
Asia > China > Inner Mongolia > Hohhot (0.04)

Genre: Research Report (1.00)

Industry:

Education (0.47)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

arXiv.org Artificial Intelligence, Mar-15-2025

MSCMHMST: A traffic flow prediction model based on Transformer

Geng, Weiyang, Pan, Yiming, Xing, Zhecong, Liu, Dongyu, Liu, Rui, Zhu, Yuan

This study proposes a hybrid model based on Transformers, named MSCMHMST, aimed at addressing key challenges in traffic flow prediction. Traditional single-method approaches show limitations in traffic prediction tasks, whereas hybrid methods, by integrating the strengths of different models, can provide more accurate and robust predictions. The MSCMHMST model introduces a multi-head, multi-scale attention mechanism, allowing the model to process different parts of the data in parallel and learn its intrinsic representations from multiple perspectives, thereby enhancing the model's ability to handle complex situations. This mechanism enables the model to capture features at various scales effectively, understanding both short-term changes and long-term trends. Verified through experiments on the PeMS04/08 dataset with specific experimental settings, the MSCMHMST model demonstrated excellent robustness and accuracy in long-, medium-, and short-term traffic flow predictions. The results indicate that this model has significant potential, offering a new and effective solution for the field of traffic flow prediction.

artificial intelligence, machine learning, transformer, (16 more...)

2503.1354

Country:

Asia > Mongolia (0.07)
Asia > China > Inner Mongolia > Hohhot (0.05)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.05)
North America > United States > California (0.04)

Genre: Research Report (0.82)

Industry:

Consumer Products & Services > Travel (1.00)
Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)

arXiv.org Artificial Intelligence, Mar-6-2025

TGEA: An error-annotated dataset and benchmark tasks for text generation from pretrained language models

He, Jie, Peng, Bo, Liao, Yi, Liu, Qun, Xiong, Deyi

In order to deeply understand the capability of pretrained language models in text generation and conduct a diagnostic evaluation, we propose TGEA, an error-annotated dataset with multiple benchmark tasks for text generation from pretrained language models (PLMs). We use carefully selected prompt words to guide GPT-2 to generate candidate sentences, from which we select 47K for error annotation. Crowdsourced workers manually check each of these sentences and detect 12K erroneous sentences. We create an error taxonomy to cover 24 types of errors occurring in these erroneous sentences according to the nature of errors with respect to linguistics and knowledge (e.g., common sense). For each erroneous span in PLM-generated sentences, we also detect another span that is closely associated with it. Each error is hence manually labeled with comprehensive annotations, including the span of the error, the associated span, minimal correction to the error, the type of the error, and rationale behind the error. Apart from the fully annotated dataset, we also present a detailed description of the data collection procedure, statistics and analysis of the dataset. This is the first dataset with comprehensive annotations for PLM-generated texts, which facilitates the diagnostic evaluation of PLM-based text generation. Furthermore, we use TGEA as a benchmark dataset and propose a series of automatic diagnosis tasks, including error detection, error type classification, associated span detection, and error rationale generation, to further promote future study on the automatic error detection and correction on texts generated by pretrained language models.
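The five annotation layers described in this abstract map naturally onto a small record type. The sketch below is one possible representation, not the dataset's actual schema: the class name `ErrorAnnotation`, the field names, the character-offset span convention, the error-type label, and the example sentence are all hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ErrorAnnotation:
    """One annotated error; field names are illustrative, not the released format."""
    text: str                         # the PLM-generated sentence
    error_span: Tuple[int, int]       # [start, end) character offsets of the error
    associated_span: Tuple[int, int]  # span closely associated with the error
    correction: str                   # minimal correction replacing the error span
    error_type: str                   # one label from the error taxonomy
    rationale: str                    # free-text reason why the span is wrong

    def corrected_text(self) -> str:
        """Apply the minimal correction to the erroneous span."""
        start, end = self.error_span
        return self.text[:start] + self.correction + self.text[end:]


def span_of(text: str, fragment: str) -> Tuple[int, int]:
    """Locate the first occurrence of `fragment` as a [start, end) span."""
    start = text.index(fragment)
    return start, start + len(fragment)


# Hypothetical example, not drawn from the dataset.
sentence = "The sun rises in the west every morning."
annotation = ErrorAnnotation(
    text=sentence,
    error_span=span_of(sentence, "west"),
    associated_span=span_of(sentence, "sun rises"),
    correction="east",
    error_type="commonsense",
    rationale="The sun rises in the east, not the west.",
)
fixed = annotation.corrected_text()
```

Storing spans as offsets rather than strings keeps the error span and the associated span unambiguous when the same word occurs more than once in a sentence.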

computational linguistic, dataset, proceedings, (15 more...)

2503.04232

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)
(16 more...)

Genre: Research Report (0.82)

Industry: Leisure & Entertainment > Sports (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

arXiv.org Artificial Intelligence, Mar-3-2025

From Hypothesis to Publication: A Comprehensive Survey of AI-Driven Research Support Systems

Zhou, Zekun, Feng, Xiaocheng, Huang, Lei, Feng, Xiachong, Song, Ziyun, Chen, Ruihan, Zhao, Liang, Ma, Weitao, Gu, Yuxuan, Wang, Baoxin, Wu, Dayong, Hu, Guoping, Liu, Ting, Qin, Bing

Research is a fundamental process driving the advancement of human civilization, yet it demands substantial time and effort from researchers. In recent years, the rapid development of artificial intelligence (AI) technologies has inspired researchers to explore how AI can accelerate and enhance research. To monitor relevant advancements, this paper presents a systematic review of the progress in this domain. Specifically, we organize the relevant studies into three main categories: hypothesis formulation, hypothesis validation, and manuscript publication. Hypothesis formulation involves knowledge synthesis and hypothesis generation. Hypothesis validation includes the verification of scientific claims, theorem proving, and experiment validation. Manuscript publication encompasses manuscript writing and the peer review process. Furthermore, we identify and discuss the current challenges faced in these areas, as well as potential future directions for research. Finally, we also offer a comprehensive overview of existing benchmarks and tools across various domains that support the integration of AI into the research process. We hope this paper serves as an introduction for beginners and fosters future research. Resources have been made publicly available at https://github.com/zkzhou126/AI-for-Research.

computational linguistic, corr, language model, (14 more...)